Abstract

In this paper we analyze the bipartite network of countries and products from UN data on country
production [1]. We define the country-country and product-product projected networks and introduce a
novel method of filtering information based on elements' similarity. As a result we find that country
clustering reveals unexpected socio-geographic links among the most competing countries. On the same
footing, the products clustering can be efficiently used for a bottom-up classification of produced goods.
Furthermore we mathematically reformulate the "reflections method" introduced by Hidalgo and
Hausmann [2] as a fixpoint problem; such formulation highlights some conceptual weaknesses of the approach.
To overcome such an issue, we introduce an alternative methodology (based on biased Markov chains)
that allows us to rank countries in a conceptually consistent way. Our analysis uncovers a strong non-linear
interaction between the diversification of a country and the ubiquity of its products, thus suggesting the
possible need of moving towards more efficient and direct non-linear fixpoint algorithms to rank countries
and products in the global market.
Introduction

Complex Networks

Networks have emerged in recent years as the main mathematical tool for the description of complex
systems. In particular, the mathematical framework of graph theory has made it possible to extract relevant
information from different biological and social systems [3-11]. In this paper we use some concepts of
network theory to address the problem of economic complexity [SHZ].

This work follows a long-standing interaction between economics and physical sciences
[8-12]; it explains, extends and complements a recent analysis of the network of trades between
nations [1,2]. Hidalgo and Hausmann (HH) address the problem of competitiveness and robustness of
different countries in the global economy by studying the differences in the Gross Domestic Product and
assuming that the development of a country is related to different "capabilities". While countries cannot
directly trade capabilities, it is the specific combination of those capabilities that results in different
products traded. More capabilities are supposed to bring higher returns, and the accumulation of new
capabilities provides an exponentially growing advantage. Therefore the origin of the differences in the
wealth of countries can be inferred from the record of trading activities, analyzed as the expression of the
capabilities of the countries.



Revealed Competitive Advantage and the country-product matrix 

We consider here the Standard Trade Classification data for the years 1992 to 2000. In the
following we shall analyze the year 2000, but similar results apply for the other snapshots. For the year
2000 the data provide information on $N_c = 129$ different countries and $N_p = 772$ different products.

To make a fair comparison between the trades, it is useful to employ Balassa's Revealed Comparative
Advantage (RCA) [13], i.e. the ratio between the export share of product p in country c and the share of
product p in the world market

$$\mathrm{RCA}_{cp} = \frac{X_{cp}}{\sum_{p'} X_{cp'}} \Big/ \frac{\sum_{c'} X_{c'p}}{\sum_{c',p'} X_{c'p'}} \qquad (1)$$

where $X_{cp}$ represents the dollar exports of country c in product p.

We consider country c to be a competitive exporter of product p if its Revealed Comparative Advantage 
(RCA) is greater than some threshold value, which we take as 1 as in standard economics literature; 
previous studies have verified that small variations around such threshold do not qualitatively change the 
results. 

The network structure of the country-product competition is given by the semipositive matrix M
defined as

$$M_{cp} = \begin{cases} 1 & \text{if } \mathrm{RCA}_{cp} > R^* \\ 0 & \text{if } \mathrm{RCA}_{cp} < R^* \end{cases} \qquad (2)$$

where $R^*$ is the threshold ($R^* = 1$).

To such matrix M we can associate a graph whose nodes are divided into two sets {c} of Nc nodes 
(the countries) and {p} of Np nodes (the products) where a link between a node c and a node p exists if 
and only if Mcp = 1, i.e. a bipartite graph. The matrix M is strictly related to the adjacency matrix of 
the country-product bipartite network. 
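
The construction of M from export data can be sketched in a few lines. The following is a minimal illustration in plain Python, not the authors' code: the function names `rca_matrix` and `binarize` and the export values in `X` are hypothetical, and the sketch assumes every country and every product has non-zero total exports.

```python
def rca_matrix(X):
    """Balassa RCA for an export table X[c][p] (rows: countries, columns: products)."""
    world_total = sum(sum(row) for row in X)
    country_total = [sum(row) for row in X]        # sum over p' of X[c][p']
    product_total = [sum(col) for col in zip(*X)]  # sum over c' of X[c'][p]
    return [
        [(X[c][p] / country_total[c]) / (product_total[p] / world_total)
         for p in range(len(product_total))]
        for c in range(len(X))
    ]


def binarize(rca, threshold=1.0):
    """M[c][p] = 1 where the RCA exceeds the threshold R*, 0 otherwise."""
    return [[1 if value > threshold else 0 for value in row] for row in rca]


# Hypothetical export values (two countries, two products), for illustration only.
X = [[10.0, 0.0],
     [5.0, 5.0]]
M = binarize(rca_matrix(X))
print(M)
```

In this toy table the first country exports only the first product, so its export share of that product exceeds the world share, while the second country is a competitive exporter of the second product only.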

The fundamental structure of the matrix M is revealed by ordering the rows of the matrix by the 
number of exported products and the columns by the number of exporting countries: doing so, M assumes 
a substantially triangular structure. Such structure reflects the fact that some countries export a large 
fraction of all products (highly diversified countries), and some products appear to be exported by most
countries (ubiquitous products). Moreover, the countries that export few products tend to export only 
ubiquitous products, while highly diversified countries are the only ones to export the products that only 
few other countries export. 

This triangular structure therefore reveals that there is a systematic relationship between the
diversification of countries and the ubiquity of the products they make. Poorly diversified countries have 
a revealed comparative advantage (RCA) almost exclusively in ubiquitous products, whereas the most 
diversified countries appear to be the only ones with RCAs in the less ubiquitous products which in 
general are of higher value on the market. It is therefore plausible that such structure reflects a ranking 
among the nations. 

The fact that the matrix is triangular rather than block-diagonal suggests that, as countries become 
more complex, they become more diversified. Countries add more new products to the export mix while 
keeping, at the same time, their traditional productions. The structure of M therefore contradicts most
classical macroeconomic models, which always predict a specialization of countries in particular sectors
of production (i.e. countries should aggregate in communities producing similar goods) that would result
in a more or less block-diagonal matrix M.

In the following, we analyze the economic consequences of the structure of the bipartite
country-product graph described by M. In particular, we analyze the community structure induced
by M on the projected networks of countries and products. As a second step, we reformulate HH's
reflection method as a linear fixpoint algorithm to determine the respective rankings of countries and
products induced by M. In this way we are able to clarify the critical aspects of this method and its
mathematical weakness. Finally, to assign proper weights to the countries, we formulate a mathematically
well-defined biased Markov chain process on the country-product network; to account for the bipartite
structure of the network, we introduce a two-parameter bias in this method. To select the optimal bias,
we compare the results of our algorithm with a standard economic indicator, the gross domestic product
(GDP). The optimal values of the parameters suggest a highly non-linear interaction between the number
of different products produced by each country (diversification) and the number of different countries
producing each product (ubiquity) in determining the competitiveness of countries and products. This
fact suggests that, to better capture the essential features of the economic competition of countries, we
need a more direct and efficient non-linear approach.



Results 

The network of countries 

In order to obtain an immediate understanding of the economic relations between countries induced
by their products, a possible approach is to define a projection graph obtained from the original set of
bipartite relations represented by the matrix M [14]. The idea is to connect the various countries with
a link whose strength is given by the number of products they both produce. In such a way the
information stored in the matrix M is projected into the network of countries, as shown in Fig. 1.

The country network can be characterized by the $(N_c \times N_c)$ country-country matrix $C = M M^T$.
The non-diagonal elements $C_{cc'}$ correspond to the number of products that countries c and c' have
in common (i.e. are produced by both countries). They are a measure of their mutual competition,
allowing a quantitative comparison between economic and financial systems [15]; the diagonal elements
$C_{cc}$ correspond to the number of products produced by country c and are a measure of the diversification
of country c.

To quantify the competition between two countries, we can define the similarity matrix among countries
as

$$S^C_{cc'} = \frac{2\,C_{cc'}}{C_{cc} + C_{c'c'}}. \qquad (3)$$

Note that $0 \le S^C_{cc'} \le 1$ and that small (large) values indicate small (large) correlations between the
products of the two countries c and c'. Similar approaches to define a correlation between vertices or a
distance [16] have often been employed in the field of complex networks, for example to detect protein
correlations [17] or to characterize the interdependencies among clinical traits of the orofacial system [18].
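
As an illustration of the projection and of the similarity measure, the sketch below computes the co-occurrence matrix $C = M M^T$ and a similarity of the form $2C_{cc'}/(C_{cc} + C_{c'c'})$ on a small triangular toy matrix. The toy matrix and the function names are hypothetical, and the similarity formula is the form assumed here for Eq. (3).

```python
def cooccurrence(M):
    """C = M M^T: C[i][j] counts the products exported by both country i and country j."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in M] for row_i in M]


def similarity(C):
    """S[i][j] = 2 C[i][j] / (C[i][i] + C[j][j]); assumes every diagonal entry is non-zero."""
    n = len(C)
    return [[2 * C[i][j] / (C[i][i] + C[j][j]) for j in range(n)] for i in range(n)]


# Hypothetical triangular toy matrix: 4 countries (rows) and 4 products (columns).
M = [[1, 1, 1, 1],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
C = cooccurrence(M)
S = similarity(C)
print(C[0])      # products shared by country 0 with each country
print(S[0][1])   # similarity between the two most diversified countries
```

The diagonal of C reproduces the diversifications (4, 3, 2, 1 in the toy case), and the similarity equals 1 only for countries with identical export baskets.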

The first problem for large correlation networks is how to visualize the relevant structure. The
simplest approach to visualize the most similar vertices is to build a Minimal Spanning
Tree (MST) [19,20]. In this method, starting from an empty graph, edges (c, c') are added in order of
decreasing similarity until all the nodes are connected; to obtain a tree, edges that would introduce a
loop are discarded. A further problem is to split the graph into smaller sub-graphs (communities) that
share important common features, i.e. have strong correlations. Similarity, like analogous correlation
indicators, can be used to detect the inner structure of a network; while different methods for community
detection vary in their detailed implementation [21,22], they give reasonably similar qualitative results
when the indicators contain the same information.

The MST method can thus be generalized in order to detect the presence of communities by adding
the extra condition that no edge is allowed between two nodes that have both already been connected to
some other node. In this way we obtain a set of disconnected sub-trees (i.e. a forest) embedded in the MST.
This Minimal Spanning Forest (MSF) method naturally splits the network of countries into separate subsets.
This method allows for the visualization of correlations in a large network and at the same time performs
a form of community detection that, if not precise, is certainly very fast.
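
The MSF rule described above can be sketched as follows. This is a minimal reading of the procedure, not the authors' implementation: edges are scanned in order of decreasing similarity, and an edge is kept only if at least one of its endpoints is still isolated, which also rules out loops. The function name, the choice to ignore zero-similarity pairs, and the similarity values are assumptions made for the example.

```python
def minimal_spanning_forest(S):
    """Return the list of edges (i, j) kept by the MSF rule for a similarity matrix S."""
    n = len(S)
    # All pairs i < j, sorted by decreasing similarity.
    edges = sorted(((S[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
                   reverse=True)
    linked = set()   # nodes that already have at least one edge
    forest = []
    for s, i, j in edges:
        if s <= 0:
            break    # assumption: pairs with no common neighbours are never linked
        if i in linked and j in linked:
            continue  # both endpoints already connected to some other node: skip
        forest.append((i, j))
        linked.update((i, j))
    return forest


# Hypothetical similarity matrix with two obvious groups, {0, 1} and {2, 3}.
S = [[1.0, 0.9, 0.1, 0.2],
     [0.9, 1.0, 0.3, 0.1],
     [0.1, 0.3, 1.0, 0.8],
     [0.2, 0.1, 0.8, 1.0]]
print(minimal_spanning_forest(S))
```

The connected components of the returned forest are the detected communities; in the toy example the weaker cross-links are never added because both of their endpoints are already attached.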

By visual inspection of Fig. 2 we can spot a large subtree composed of developed countries and some
other subtrees in which clear geographical correlations are present. Notice that each subtree contains
countries with very similar products, i.e. countries that are competing on the same markets. In particular,
developing countries seem to be mostly direct competitors of their geographical neighbors. This is a
general feature of economic systems, even if it is not the most rational choice [23,24]: as an example,
both banks [25,26] and countries trade preferentially with similar partners, thereby affecting the whole
robustness of the system [27,28]. This behavior can be reproduced by simple statistical models based on
agents' fitnesses [29].

The network of products 

Similarly to countries, we can project the bipartite graph into a product network by connecting two
products if they are produced by one or more common countries, giving this link a weight proportional
to the number of countries producing both products. Such a network can be represented by the
$(N_p \times N_p)$ product-product matrix $P = M^T M$. The non-diagonal elements $P_{pp'}$ correspond to the
number of countries producing both p and p', while the diagonal elements $P_{pp}$ correspond to the
number of countries producing p.

In analogy with Eq. (3), the similarity matrix among products is defined as

$$S^P_{pp'} = \frac{2\,P_{pp'}}{P_{pp} + P_{p'p'}}. \qquad (4)$$

It indicates how much products are correlated on a market: a value $S^P_{pp'} = 1$ indicates that whenever
product p is present on the market of a country, product p' is present as well. This could be for
example the case of two products p, p' that are both necessary for the same and only industrial process.

As in the case of countries, the MSF algorithm can be applied to visualize correlations and detect
communities. In the case of the product network this analysis leads to an apparently contradictory
result. Products are officially characterized by a hierarchical topology assigned by the UN.
Within this classification, similar items such as "metalliferous ores and metal scraps" (groups 27.xx) are in
a totally different section with respect to "non ferrous metals" (groups 68.xx). By applying our new
algorithm, based on the economic competition network M, one would naively expect that products
belonging to the same UN hierarchy should belong to the same community and vice versa; therefore, if
we assign different colors to different UN hierarchies, one would expect all the nodes belonging to
a single community to be of the same color. In Fig. 3 we show that this is not the case. Such a paradox
can be understood by analyzing in closer detail the communities detected with the MSF method. As an
example, we show in Fig. 4 a large community where most of the vertices belong to the area of "vehicle
part and constituents". In this cluster we can spot the noticeable presence of a vertex belonging to the "food"
hierarchy. This apparent contradiction is resolved by noticing that such vertex refers to colza seeds, a
plant recently used mostly for bio-fuels and not for food: our MSF method has correctly
positioned this "food" product in the "vehicle" cluster. Therefore, methods based on community detection
could be considered as a possible rational substitute for current top-down "human-made" taxonomies [29].

Ranking Countries and Products by Reflection Method 

Hidalgo and Hausmann (HH) have introduced in [1,2] the fundamental idea that the complex set of
capabilities of countries (in general hardly comparable between different countries) can be inferred from the
structure of the matrix M (which we can observe). In this spirit, ubiquitous products require few capabilities
and can be produced by most countries, while diversified countries possess many capabilities, allowing
them to produce most products. Therefore, the most diversified countries are expected to be amongst the
top ones in the global competition; on the same footing, ubiquitous products are likely to correspond to
low-quality products.

In order to refine such intuitions into a quantitative ranking among countries and products, the authors
of [1,2] have introduced two quantities: the n-th level diversification $d_c^{(n)}$ (called $k_{c,n}$ in [1,2]) of the
country c and the n-th level ubiquity $u_p^{(n)}$ (called $k_{p,n}$ in [1,2]) of the product p. At the zeroth order the
diversification of a country is simply defined as the number of its products, or

$$d_c^{(0)} = \sum_{p=1}^{N_p} M_{cp} \equiv k_c \qquad (5)$$

(where $k_c$ is the degree of the node c in the bipartite country-product network); analogously the zeroth
order ubiquity of a product is defined as the number of different countries producing it

$$u_p^{(0)} = \sum_{c=1}^{N_c} M_{cp} \equiv k_p \qquad (6)$$

where $k_p$ is the degree of the node p in the bipartite country-product network. The diversification $k_c$ is
intended to represent the zeroth order measure of the "quality" of the country c, with the idea that the
more products a country exports the stronger its position on the market. The ubiquity $k_p$ is intended
to represent the zeroth order measure of the "dis-value" of the product p in the global competition, with
the idea that the more countries produce a product, the lower its value on the market.

In the original approach these two initial quantities are refined in an iterative way via a so-called
"reflections method", which defines the diversification of a country at the (n+1)-th iteration as
the average ubiquity of its products at the n-th iteration, and the ubiquity of a product at the (n+1)-th
iteration as the average diversification of its producing countries at the n-th iteration:

$$d_c^{(n+1)} = \frac{1}{k_c}\sum_{p=1}^{N_p} M_{cp}\, u_p^{(n)}, \qquad u_p^{(n+1)} = \frac{1}{k_p}\sum_{c=1}^{N_c} M_{cp}\, d_c^{(n)}. \qquad (7)$$

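In vectorial form, this can be cast in the following form

$$\mathbf{d}^{(n+1)} = J_A\, \mathbf{u}^{(n)}, \qquad \mathbf{u}^{(n+1)} = J_B\, \mathbf{d}^{(n)}, \qquad (8)$$

where $\mathbf{d}^{(n)}$ is the $N_c$-dimensional vector of components $d_c^{(n)}$, $\mathbf{u}^{(n)}$ is the $N_p$-dimensional vector of
components $u_p^{(n)}$, and where we have called $J_A = C M$ and $J_B = P M^t$ (the upper suffix t stands for
"transpose"), with C and P respectively the $N_c \times N_c$ and $N_p \times N_p$ square diagonal matrices defined by
$C_{cc'} = k_c^{-1} \delta_{cc'}$ and $P_{pp'} = k_p^{-1} \delta_{pp'}$ (in this section C and P denote these diagonal matrices, not the
projected networks introduced earlier).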
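
The iteration can be made concrete with a short sketch. It assumes the reflections update in the form "new diversification = average ubiquity of the country's products, new ubiquity = average diversification of the product's exporters", started from the degrees $k_c$ and $k_p$; the function name and the triangular toy matrix are hypothetical.

```python
def reflections(M, steps):
    """Return the list [(d^(0), u^(0)), (d^(1), u^(1)), ...] of reflection iterates."""
    kc = [sum(row) for row in M]           # diversification (country degree)
    kp = [sum(col) for col in zip(*M)]     # ubiquity (product degree)
    n_c, n_p = len(kc), len(kp)
    d = [float(k) for k in kc]
    u = [float(k) for k in kp]
    history = [(d, u)]
    for _ in range(steps):
        # Both updates use the values of the previous iteration.
        d_new = [sum(M[c][p] * u[p] for p in range(n_p)) / kc[c] for c in range(n_c)]
        u_new = [sum(M[c][p] * d[c] for c in range(n_c)) / kp[p] for p in range(n_p)]
        d, u = d_new, u_new
        history.append((d, u))
    return history


# Hypothetical triangular toy matrix: country 0 is the most diversified.
M = [[1, 1, 1, 1],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
history = reflections(M, steps=2)
for n, (d, u) in enumerate(history):
    print(n, d)
```

On this toy matrix the first-order diversification is largest for the least diversified country, while the second-order one restores the zeroth-order ordering, which is the alternation between odd and even iterations that the reflections method exhibits.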

Such an approach suffers from some flaws. The first one is related to the fact that the process is
defined on a bipartite network and therefore even and odd iterations have different meanings. In fact,
let us consider the diversification $d_c^{(1)}$ of the c-th country: as prescribed by the algorithm, $d_c^{(1)}$ is the
average ubiquity of the products of the c-th country at the 0-th iteration. Therefore countries with the most
ubiquitous (less valuable) products would get the highest 1st order diversification. On the other hand,
the approximately triangular structure of M tells us that these countries are the same ones with a small
degree and therefore with a low value of the 0-th order diversification $d_c^{(0)}$. As shown by [1,2], this is
the case also at higher orders; therefore the diversifications at even and odd iterations are substantially
anti-correlated. Conversely, successive even iterations are positively correlated, so that $d_c^{(2)}$ looks like a
refinement of $d_c^{(0)}$, $d_c^{(4)}$ a refinement of $d_c^{(2)}$ and so on. The same considerations apply to the iterations for
the ubiquity of products.

The major flaw in the HH algorithm is that it is a case of a consensus dynamics, i.e. the state of a 
node at iteration t is just the average of the state of its neighbors at iteration t - 1. It is well known
that such iterations have the uniform state (all the nodes equal) as the natural fixpoint. It is therefore 
puzzling how such "equalizing" procedure could lead to any form of ranking. To solve such a puzzle, let's 
write the HH algorithm as a simple iterative linear system and analyze its behavior. 

Focusing only on even iterations and on diversifications, we can write the HH procedure as:

$$\mathbf{d}^{(2n)} = J_A J_B\, \mathbf{d}^{(2n-2)} = (J_A J_B)^n\, \mathbf{d}^{(0)} = H^n\, \mathbf{d}^{(0)}, \qquad (9)$$

where $H = J_A J_B = C M P M^t$ is an $N_c \times N_c$ square matrix.

The matrix H in Eq. (9) is a Markovian stochastic matrix when it acts from the right on positive vectors,
in the sense that every element $H_{cc'} \ge 0$ and

$$\sum_{c'=1}^{N_c} H_{cc'} = 1.$$

In particular, for the given adjacency matrix M it is also ergodic. Therefore, its spectrum of eigenvalues
is bounded in absolute value by its unique upper eigenvalue $\lambda_1 = 1$. Since H acts on $\mathbf{d}^{(2n)}$ from the
left, the right eigenvector $\mathbf{e}_1$ corresponding to the largest eigenvalue $\lambda_1 = 1$ is simply a uniform vector
with identical components, i.e. in the $n \to \infty$ limit $\mathbf{d}^{(2n)}$ converges to the fixpoint $\mathbf{e}_1$ where all countries
have the same asymptotic diversification.
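
A quick numerical check of these two statements is sketched below. It assumes $H = C M P M^t$ with C and P the diagonal matrices of inverse degrees, i.e. $H_{cc'} = k_c^{-1} \sum_p M_{cp} M_{c'p} / k_p$; the helper names and the toy matrix are hypothetical.

```python
def build_H(M):
    """H[c][c2] = (1 / k_c) * sum_p M[c][p] * M[c2][p] / k_p."""
    kc = [sum(row) for row in M]
    kp = [sum(col) for col in zip(*M)]
    n_c, n_p = len(M), len(kp)
    return [[sum(M[c][p] * M[c2][p] / kp[p] for p in range(n_p)) / kc[c]
             for c2 in range(n_c)]
            for c in range(n_c)]


def apply_matrix(H, d):
    """Matrix-vector product H d, with H acting on d from the left."""
    return [sum(h * x for h, x in zip(row, d)) for row in H]


# Hypothetical triangular toy matrix.
M = [[1, 1, 1, 1],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
H = build_H(M)
row_sums = [sum(row) for row in H]

d = [float(sum(row)) for row in M]   # start from the diversifications k_c
for _ in range(200):
    d = apply_matrix(H, d)
spread = max(d) - min(d)
print(row_sums, spread)
```

Every row of H sums to one, and repeated application of H drives all components to a common value, so the raw even iterates by themselves carry no ranking in the long run.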

It is therefore not by chance that HH prescribe stopping their algorithm at a finite number of iterations
and that they introduce, as a recipe, to consider as the ranking of a country the rescaled version of the
2n-th level diversifications [2]

$$\tilde{d}_c^{(2n)} = \frac{d_c^{(2n)} - \langle d^{(2n)} \rangle}{\sigma^{(2n)}}, \qquad (10)$$

where $\langle d^{(2n)} \rangle$ is the arithmetic mean of all $d_c^{(2n)}$ and $\sigma^{(2n)}$ the standard deviation of the same set. With
this prescription, the HH algorithm seems to converge to an approximately constant value after about 16 steps.
This observed behavior can easily be explained by noticing that, in contrast with the erroneous
statement in [5], finding the fitness by the reflection method can be reformulated as a fixpoint problem
(our Eq. (9)) and solved using the spectral properties of a linear system. In fact, owing to the ergodic Markovian
nature of H, we can order eigenvalues/eigenvectors such that $|\lambda_{N_c}| \le |\lambda_{N_c-1}| \le \dots \le |\lambda_2| < \lambda_1 = 1$.
Therefore, expanding the initial condition $\mathbf{d}^{(0)}$ in terms of the right eigenvectors $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_{N_c}\}$ of H,

$$\mathbf{d}^{(0)} = a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + \dots + a_{N_c} \mathbf{e}_{N_c},$$

we can write the 2n-th iterate as

$$\mathbf{d}^{(2n)} = a_1 \mathbf{e}_1 + a_2 \lambda_2^n \mathbf{e}_2 + \dots + a_{N_c} \lambda_{N_c}^n \mathbf{e}_{N_c} = a_1 \mathbf{e}_1 + a_2 \lambda_2^n \mathbf{e}_2 + O\big((\lambda_3/\lambda_2)^n\big). \qquad (11)$$

Therefore, at sufficiently large n the ordering of the countries is completely determined by the components
of $\mathbf{e}_2$; notice that such an asymptotic ordering is independent of the initial condition $\mathbf{d}^{(0)}$ and therefore
should be considered as the appropriate fixpoint renormalized fitness $\mathbf{d}^*$ for all countries.

What happens to the HH scheme? At sufficiently large n, $\langle d^{(2n)} \rangle \approx a_1 \mathbf{e}_1$ and $\tilde{\mathbf{d}}^{(2n)} \propto a_2 \lambda_2^n \mathbf{e}_2 +
O\big((\lambda_3/\lambda_2)^n\big)$; therefore $\tilde{\mathbf{d}}^{(2n)}$ becomes proportional to $\mathbf{e}_2$ (Eq. (11)). The number of iterations $it$ needed
to converge is given by the ratio between $\lambda_2$ and $\lambda_3$, namely $(\lambda_3/\lambda_2)^{it} \ll 1$; therefore the $it \approx 16$ iterations
prescribed by HH are not a general prescription but depend on the structure of the network analyzed.

Notice also that when the numerical reflection method is used, the renormalized fitness represents
a deviation $O(\lambda_2^n)$ from a constant and can be detected only if it is bigger than the numerical error;
therefore only a "not too big" number of iterations $it$ can be employed. On the other hand, the spectral
characterization we propose does not suffer from such a pitfall. Similar considerations can be developed for the
even iterations of the reflection method for the products.
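
The numerical pitfall can be seen on a toy example. The sketch below iterates $\mathbf{d} \to H\mathbf{d}$, assuming $H_{cc'} = k_c^{-1} \sum_p M_{cp} M_{c'p} / k_p$, and records at each even iteration both the raw spread of the diversifications and their rescaled version (mean subtracted, divided by the standard deviation). All names and the toy matrix are hypothetical.

```python
def build_H(M):
    """H[c][c2] = (1 / k_c) * sum_p M[c][p] * M[c2][p] / k_p."""
    kc = [sum(row) for row in M]
    kp = [sum(col) for col in zip(*M)]
    return [[sum(M[c][p] * M[c2][p] / kp[p] for p in range(len(kp))) / kc[c]
             for c2 in range(len(M))]
            for c in range(len(M))]


def standardize(d):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = sum(d) / len(d)
    sigma = (sum((x - mean) ** 2 for x in d) / len(d)) ** 0.5
    return [(x - mean) / sigma for x in d]


# Hypothetical triangular toy matrix.
M = [[1, 1, 1, 1],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
H = build_H(M)
d = [float(sum(row)) for row in M]
spreads, rescaled = [], []
for n in range(7):                       # even iterations 0, 2, ..., 12
    spreads.append(max(d) - min(d))
    rescaled.append(standardize(d))
    d = [sum(h * x for h, x in zip(row, d)) for row in H]
print(spreads)
print(rescaled[0], rescaled[-1])
```

The raw spread shrinks geometrically towards the level of round-off error, whereas the rescaled vector stays of order one and keeps the same ordering; this is why the iterative recipe only works for a limited number of steps, while an eigenvector computation would give the ordering directly.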

Biased Markov chain approach and non-linear interactions

Having assessed the flaws of HH's method, we investigate the possibility of defining alternative linear 
algorithms able to implement similar economical intuitions about the ranking of the countries while 
keeping a more robust mathematical foundation. In formulating such a new scheme we will keep the 
approximation of linearity for the iterations even though we shall find in the results hints of the non- 
linear nature of the problem. 

Our approach is inspired by the well-known PageRank algorithm [30]. PageRank (named after the
WWW, where vertices are the pages) is one of the most famous Bonacich centrality measures [31].
In the original PageRank method the ranking of a vertex is proportional to the time spent on it by an 
unbiased random walker (in different contexts [11] analogous measures assess the stability of a firm in a 
business firm network). 

We define the weights of vertices to be proportional to the time that an appropriately biased random
walker on the network spends on them in the large time limit [32]. As shown below, such weights,
being the generalization of $k_c$ and $k_p$, give a measure respectively of the competitiveness of countries and
of the "dis-quality" (or lack of competitiveness) of products. As the nodes of our bipartite network are entities
that are logically and conceptually separated (countries and products), we assign to the random walker
a different bias when jumping from countries to products with respect to jumping from products to countries.

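Let us call $w_c^{(n)}$ the weight of country c at the n-th iteration and $w_p^{(n)}$ the fitness of product p at the n-th
iteration. We define the following Markov process on the country-product bipartite network

$$w_c^{(n+1)}(\alpha,\beta) = \sum_{p=1}^{N_p} G_{cp}(\beta)\, w_p^{(n)}(\alpha,\beta), \qquad w_p^{(n+1)}(\alpha,\beta) = \sum_{c=1}^{N_c} G_{pc}(\alpha)\, w_c^{(n)}(\alpha,\beta) \qquad (12)$$

where the Markov transition matrix G is given by

$$G_{cp}(\beta) = \frac{M_{cp}\, k_c^{-\beta}}{\sum_{c'=1}^{N_c} M_{c'p}\, k_{c'}^{-\beta}}, \qquad G_{pc}(\alpha) = \frac{M_{cp}\, k_p^{-\alpha}}{\sum_{p'=1}^{N_p} M_{cp'}\, k_{p'}^{-\alpha}}. \qquad (13)$$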



Here $G_{cp}$ gives the probability to jump from product p to country c in a single step, and $G_{pc}$ the
probability to jump from country c to product p, also in a single step. Note that Eqs. (13) define an
$(N_c + N_p)$-dimensional connected Markov chain of period two. Therefore, random walkers initially
starting from countries will be found on products at odd steps and on countries at even ones; the reverse
happens for random walkers starting from products. By considering separately the random walkers
starting from countries and from products, we can reduce this Markov chain to two ergodic Markov
chains of respective dimension $N_c$ and $N_p$. In particular, if the walker starts from a country, using a
vectorial formalism, we can write for the weights of countries



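$$\mathbf{w}_{\mathcal{C}}^{(n+1)}(\alpha,\beta) = T(\alpha,\beta)\, \mathbf{w}_{\mathcal{C}}^{(n)}(\alpha,\beta) \qquad (14)$$

where $\mathbf{w}_{\mathcal{C}}$ is the vector of the country weights $w_c$ and the $N_c \times N_c$ ergodic stochastic matrix T is defined by

$$T_{cc'}(\alpha,\beta) = \sum_{p=1}^{N_p} G_{cp}(\beta)\, G_{pc'}(\alpha). \qquad (15)$$

At the same time for products we can write

$$\mathbf{w}_{\mathcal{P}}^{(n+1)}(\alpha,\beta) = S(\alpha,\beta)\, \mathbf{w}_{\mathcal{P}}^{(n)}(\alpha,\beta) \qquad (16)$$

where $\mathbf{w}_{\mathcal{P}}$ is the vector of the product weights $w_p$ and the $N_p \times N_p$ ergodic stochastic matrix S is given by

$$S_{pp'}(\alpha,\beta) = \sum_{c=1}^{N_c} G_{pc}(\alpha)\, G_{cp'}(\beta). \qquad (17)$$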

Given the structure of T and S, it is simple to show that the two matrices share the same eigenvalue
spectrum, which is upper bounded in modulus by the unique eigenvalue $\mu_1 = 1$. For both matrices, the
eigenvectors corresponding to $\mu_1$ are the stationary and asymptotic weights $\{w_c^*(\alpha,\beta)\}$ and $\{w_p^*(\alpha,\beta)\}$
of the Markov chains. In order to find analytically such asymptotic values, we apply the detailed balance
condition:

$$G_{pc}\, w_c^* = G_{cp}\, w_p^* \qquad \forall (c,p) \qquad (18)$$

which gives

$$w_c^* = A \Big( \sum_{p=1}^{N_p} M_{cp}\, k_p^{-\alpha} \Big) k_c^{-\beta}, \qquad w_p^* = B \Big( \sum_{c=1}^{N_c} M_{cp}\, k_c^{-\beta} \Big) k_p^{-\alpha} \qquad (19)$$


where A and B are normalization constants. Note that for $\alpha = \beta = 0$ Eq. (13) gives the completely
unbiased random walk for which $T = H^t$, where H is given in Eq. (9). Therefore, in this case Eqs. (19)
become

$$w_c^*(0,0) \sim k_c, \qquad w_p^*(0,0) \sim k_p, \qquad (20)$$

as for the case of unbiased random walks on a simple connected network, where the asymptotic weight of a
node is proportional to its connectivity. Thus, in the case $\alpha = \beta = 0$ we recover the zeroth order iteration of
HH's reflection method. Note that, in the same spirit of HH, $w_c^*(0,0)$ gives a rough measure of the
competitiveness of country c while $w_p^*(0,0)$ gives an approximate measure of the dis-quality in the market of
product p. By continuity, we associate the same meaning of competitiveness/dis-quality to the stationary
states $w_c^*$/$w_p^*$ at different values of $\alpha$ and $\beta$.
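
The stationary weights can be checked numerically on a toy network. The sketch below assumes transition probabilities of the form $G_{cp}(\beta) \propto M_{cp} k_c^{-\beta}$ (normalized over countries) and $G_{pc}(\alpha) \propto M_{cp} k_p^{-\alpha}$ (normalized over products), iterates the two-step chain on the countries, and compares the result with the closed form $w_c^* \propto (\sum_p M_{cp} k_p^{-\alpha}) k_c^{-\beta}$. Function names, the toy matrix and the number of steps are hypothetical choices.

```python
def closed_form_weights(M, alpha, beta):
    """Normalized country weights from the detailed-balance solution."""
    kc = [sum(row) for row in M]
    kp = [sum(col) for col in zip(*M)]
    w = [sum(M[c][p] * kp[p] ** (-alpha) for p in range(len(kp))) * kc[c] ** (-beta)
         for c in range(len(kc))]
    total = sum(w)
    return [x / total for x in w]


def iterated_weights(M, alpha, beta, steps=500):
    """Iterate countries -> products -> countries starting from uniform weights."""
    kc = [sum(row) for row in M]
    kp = [sum(col) for col in zip(*M)]
    n_c, n_p = len(kc), len(kp)
    # g_cp[c][p]: probability of jumping from product p to country c (bias beta).
    g_cp = [[M[c][p] * kc[c] ** (-beta)
             / sum(M[c2][p] * kc[c2] ** (-beta) for c2 in range(n_c))
             for p in range(n_p)] for c in range(n_c)]
    # g_pc[p][c]: probability of jumping from country c to product p (bias alpha).
    g_pc = [[M[c][p] * kp[p] ** (-alpha)
             / sum(M[c][p2] * kp[p2] ** (-alpha) for p2 in range(n_p))
             for c in range(n_c)] for p in range(n_p)]
    w = [1.0 / n_c] * n_c
    for _ in range(steps):
        w_products = [sum(g_pc[p][c] * w[c] for c in range(n_c)) for p in range(n_p)]
        w = [sum(g_cp[c][p] * w_products[p] for p in range(n_p)) for c in range(n_c)]
    return w


# Hypothetical triangular toy matrix.
M = [[1, 1, 1, 1],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
biased = closed_form_weights(M, alpha=1.1, beta=0.8)
simulated = iterated_weights(M, alpha=1.1, beta=0.8)
unbiased = closed_form_weights(M, alpha=0, beta=0)
print(biased, simulated, unbiased)
```

For zero bias the weights reduce to the normalized degrees $k_c$, and for the biased case the iterated chain converges to the closed-form expression, as expected from detailed balance.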

To understand the behavior of our ranking with respect to the bias, we have analyzed the mean correlation
(square of the Pearson coefficient) for the year 1998 (other years give analogous results) between the
logarithm of the GDP^1 of each country and its weight (Eqs. (19)) for different values of $\alpha$ and $\beta$ (see
Fig. 5).

It is interesting to note that the region of large correlations (region inside the contour plot in
Fig. 5) is found in the positive quadrant for about $0.2 < \alpha < 1.8$ and $0.5 < \beta < 1$; in particular the
maximal value is approximately at $\alpha \approx 1.1$ and $\beta \approx 0.8$. These results can be connected with the
approximately "triangular" shape of the matrix M. In fact, let us rewrite Eqs. (19) (apart from the
normalization constant) as:

$$w_c^* \propto k_c^{1-\beta}\, \langle k_p^{-\alpha} \rangle_c, \qquad w_p^* \propto k_p^{1-\alpha}\, \langle k_c^{-\beta} \rangle_p$$

where $\langle k_p^{-\alpha} \rangle_c$ is the arithmetic average of $k_p^{-\alpha}$ over the products exported by country c and $\langle k_c^{-\beta} \rangle_p$ is
the arithmetic average of $k_c^{-\beta}$ over the countries exporting product p. Since $\beta$ is substantially positive and
slightly smaller than 1 and $\alpha$ is definitely positive with optimal values around 1, the competitive countries
will be characterized by a good balance between a high value of $k_c$ and a small typical value of $k_p$ of their
products. Nevertheless, since the optimal values of $\alpha$ are distributed up to the region of values much
larger than 1 (i.e. $1 - \beta$ is significantly smaller than 1), we see that the major role for the asymptotic
weight of a country is played by the presence in its portfolio of un-ubiquitous products, which alone
give the dominant contribution to $w_c^*$. A similar reasoning leads to the conclusion that the dis-value
(or ugliness) of a product is basically determined by the presence in the set of its producers of poorly
diversified countries that are basically exporting only products characterized by a low level of complexity.
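
The rewriting in terms of averages follows in one line from the stationary weights, assuming $w_c^* \propto (\sum_p M_{cp} k_p^{-\alpha})\, k_c^{-\beta}$ and taking $\langle \cdot \rangle_c$ to be the average over the $k_c$ products exported by country c (the product case is identical with the roles of c and p, and of $\alpha$ and $\beta$, exchanged):

```latex
\sum_{p=1}^{N_p} M_{cp}\, k_p^{-\alpha}
  = k_c \cdot \frac{1}{k_c} \sum_{p=1}^{N_p} M_{cp}\, k_p^{-\alpha}
  = k_c\, \langle k_p^{-\alpha} \rangle_c
\quad\Longrightarrow\quad
w_c^* \propto k_c\, \langle k_p^{-\alpha} \rangle_c\, k_c^{-\beta}
      = k_c^{\,1-\beta}\, \langle k_p^{-\alpha} \rangle_c .
```

For $\beta$ close to 1 the prefactor $k_c^{1-\beta}$ depends only weakly on the diversification, so the average $\langle k_p^{-\alpha} \rangle_c$, which is dominated by the least ubiquitous products when $\alpha$ is large, controls the weight.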

Our new approach based on biased Markov chain theory thus permits us to implement the interesting
ideas developed by HH in [2] on a more solid mathematical basis, using the framework of linear iterated
transformations and avoiding the indicated flaws of HH's "reflection method". Interestingly, our results
reveal a strongly non-linear entanglement between the two basic pieces of information one can extract from the
matrix M: diversification of countries and ubiquity of products. In particular, this non-linear relation
makes explicit an almost extremal influence of ubiquity of products on the competitiveness of a country
in the global market: having "good" or complex products in the portfolio is more important than having
many products of poor value. Furthermore, the information that a product has among its producers some
poorly diversified countries is nearly sufficient to say that it is a non-complex (dis-valuable) product in
the market. This strongly non-linear entanglement between diversifications of countries and ubiquities
of products is an indication of the necessity to go beyond the linear approach in order to introduce a more
sound and direct description of the competition of countries and products, possibly based on a suitable
ab initio non-linear approach characterized by a smaller number of ad hoc assumptions [36].

^1 We are aware that GDP is not an absolute measure of wealth [33], as it does not account directly for relevant
quantities like the wealth due to natural resources [34]. Nevertheless, we expect that GDP monotonically increases
with the wealth. What network analysis shows is that the number of products is correlated with both quantities.
We envisage such kind of analysis in order to define suitable policies for underdeveloped countries [35].
Discussion

In this paper we applied methods of graph theory to the analysis of the economic productions of
countries. The information is available in the form of an $N_c \times N_p$ rectangular matrix M giving the different
production of the possible $N_p$ goods for each of the $N_c$ countries. The matrix M corresponds to a
bipartite graph, the country-product network, that can be projected into the country-country network C
and the product-product network P. By using complex-networks analysis, we can attain an effective
filtering of the information contained in C and P. We introduce a new filtering algorithm that identifies
communities of countries with similar production. As an unexpected result, this analysis shows that
neighboring countries tend to compete over the same markets instead of diversifying. We also show that
a classification of goods based on such filtering provides an alternative product taxonomy determined by
the countries' activity. We then study the ranking of the countries induced by the country-product
bipartite network. We first show that the ranking of HH's reflection method is the fixpoint of a linear process; in
this way we can avoid some logical and numerical pitfalls and clarify some of its weak theoretical points.
Finally, in analogy with the Google PageRank algorithm, we define a biased, two-parameter Markov
chain algorithm to assign ranking weights to countries and products by taking into account the structure
of the adjacency matrix of the country-product bipartite network. By correlating the fixpoint ranking
(i.e. competitiveness of countries and products) with the GDP of each country, we find that the optimal
bias parameters of the algorithm indicate a strongly non-linear interaction between the diversification of
the countries and the ubiquity of the products.



Materials and Methods 
Graphs 

A graph is a couple G = {V, E) where V = {vi\i = 1 ... n^} is the set of vertices, and E QV xV \s the 
set of edges. A graph G can be represented via its adjacency matrix A 

^ f 1 if an edge exists between Vi and Vj 
[ otherwise. 

The degree fc,; of the node Vi is the number J^j -^ij of its neighbors. 

An unbiased random walk on a graph G is characterized by a probability p_{ij} = 1/k_i of jumping from
a vertex v_i to one of its k_i neighbors and is described by the jump matrix

J_G = K^{-1} A, (22)

where K is the diagonal matrix K_{ij} = k_i \delta_{ij} of the node degrees.
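
As an illustration of Eq. (22), the following minimal sketch builds the jump matrix row by row by dividing each row of A by the degree of the corresponding vertex. The function name `jump_matrix` and the three-vertex path graph are hypothetical, chosen only for this example; plain Python lists are used to keep the sketch dependency-free.

```python
def jump_matrix(adjacency):
    """Return J = K^{-1} A for a graph given as a 0/1 adjacency matrix.

    Assumption of this sketch: every vertex has at least one neighbor,
    otherwise K is not invertible and the walk is undefined there.
    """
    jump = []
    for row in adjacency:
        degree = sum(row)  # k_i = sum_j A_ij
        if degree == 0:
            raise ValueError("isolated vertex: K is not invertible")
        jump.append([a / degree for a in row])  # p_ij = A_ij / k_i
    return jump


# Hypothetical toy graph: the path v0 - v1 - v2.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
J = jump_matrix(A)
```

Each row of J sums to one, as expected for the transition probabilities of a random walk.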

Bipartite Graphs 

A bipartite graph is a triple G = (A, B, E), where A = {a_i | i = 1 ... n_A} and B = {b_j | j = 1 ... n_B} are
two disjoint sets of vertices and E ⊆ A × B is the set of edges, i.e. edges exist only between vertices of
the two different sets A and B.

The bipartite graph G can be described by the matrix M defined as

M_{ij} = \begin{cases} 1 & \text{if an edge exists between } a_i \text{ and } b_j, \\ 0 & \text{otherwise.} \end{cases} (23)



In terms of M, it is possible to define the adjacency matrix A_G of G as

A_G = \begin{pmatrix} 0 & M \\ M^T & 0 \end{pmatrix} (24)
It is also useful to define the co-occurrence matrices P^A = M M^T and P^B = M^T M, which respectively
count the number of common neighbors between two vertices of A or of B. P^A is the weighted adjacency
matrix of the co-occurrence graph with vertices on A, where each non-zero element P^A_{ij} corresponds
to an edge between the vertices a_i and a_j with weight P^A_{ij}. The same is valid for the co-occurrence
matrix P^B and the co-occurrence graph with vertices on B.

Many projection schemes for a bipartite graph G start by constructing one of the two co-occurrence graphs and
eliminating the edges whose weights are less than a given threshold or whose statistical significance is
low.
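
The co-occurrence matrices and the simplest threshold-based projection can be sketched as follows. This is only an illustration under stated assumptions: the function names `cooccurrence` and `threshold_projection`, the toy matrix M and the threshold value are hypothetical, and dropping the diagonal (which holds the vertex degrees) is a choice of this sketch rather than something prescribed in the text. It is not the filtering algorithm introduced in this paper.

```python
def cooccurrence(M):
    """Return (P_A, P_B) = (M M^T, M^T M) for a 0/1 biadjacency matrix M."""
    n_a, n_b = len(M), len(M[0])
    # (M M^T)_ij counts the neighbors shared by a_i and a_j.
    P_A = [[sum(M[i][k] * M[j][k] for k in range(n_b)) for j in range(n_a)]
           for i in range(n_a)]
    # (M^T M)_ij counts the neighbors shared by b_i and b_j.
    P_B = [[sum(M[k][i] * M[k][j] for k in range(n_a)) for j in range(n_b)]
           for i in range(n_b)]
    return P_A, P_B


def threshold_projection(P, threshold):
    """Keep off-diagonal edges whose weight is at least `threshold`.

    Assumption of this sketch: self-loops (the diagonal of P) are discarded.
    """
    n = len(P)
    return [[1 if i != j and P[i][j] >= threshold else 0 for j in range(n)]
            for i in range(n)]


# Hypothetical toy bipartite graph with three vertices in A and three in B.
M = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]
P_A, P_B = cooccurrence(M)
projected = threshold_projection(P_A, threshold=2)
```

In this toy case only a_0 and a_1 share two neighbors, so the thresholded projection keeps a single edge between them.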



Matrix from RCA 

To make a fair comparison between the exports, it is useful to employ Balassa's Revealed Comparative
Advantage (RCA) [13], i.e. the ratio between the export share of product p in country c and the share of
product p in the world market,

RCA_{cp} = \frac{X_{cp} / \sum_{p'} X_{cp'}}{\sum_{c'} X_{c'p} / \sum_{c',p'} X_{c'p'}}, (25)

where X_{cp} represents the dollar exports of country c in product p.

The network structure is given by the country-product adjacency matrix M defined as

M_{cp} = \begin{cases} 1 & \text{if } RCA_{cp} > R^*, \\ 0 & \text{if } RCA_{cp} < R^*, \end{cases} (26)

where R^* is the threshold. A positive entry, M_{cp} = 1, tells us that country c is a competitive exporter of
the product p.
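
Equations (25) and (26) can be sketched in a few lines. The export values below are hypothetical toy numbers, the function names `rca_matrix` and `binarize` are introduced only for this example, and the default threshold `r_star=1.0` is an assumption of the sketch, since the value of R^* is not fixed in this section. Entries with RCA exactly equal to the threshold are mapped to 0 here.

```python
def rca_matrix(X):
    """Balassa RCA for an export matrix X (rows: countries, columns: products).

    Assumption of this sketch: every country and every product has
    non-zero total exports, so no denominator vanishes.
    """
    country_totals = [sum(row) for row in X]        # sum_p' X_cp'
    product_totals = [sum(col) for col in zip(*X)]  # sum_c' X_c'p
    world_total = sum(country_totals)               # sum_c'p' X_c'p'
    return [
        [(x / country_totals[c]) / (product_totals[p] / world_total)
         for p, x in enumerate(row)]
        for c, row in enumerate(X)
    ]


def binarize(rca, r_star=1.0):
    """M_cp = 1 if RCA_cp > r_star, else 0 (r_star is an assumed threshold)."""
    return [[1 if value > r_star else 0 for value in row] for row in rca]


# Hypothetical dollar exports of two countries in two products.
X = [
    [80.0, 20.0],
    [20.0, 80.0],
]
rca = rca_matrix(X)
M = binarize(rca)
```

Here each country devotes 80% of its exports to one product that makes up 50% of world trade, so its RCA in that product is 0.8 / 0.5 = 1.6 and the corresponding entry of M is 1.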

Figure 1. The network of countries and products and the two possible projections.

Figure 2. The Minimal Spanning Forest for the Countries. The various subgraphs have a
distinct geographical similarity. We show northern European countries in green and the "Baltic"
republics in red. In general, neighboring countries (also in a social and cultural sense) compete for the
production of similar goods.

Figure 3. The Minimal Spanning Forest (MSF) for the Products. Nodes are colored
according to the first digit of the COMTRADE classification. This analysis should reveal correlations
between different but similar products.

[Figure 4 image. Legible node labels include: 7310 Passenger motor cars; 7161 Motors and generators; 7783 Electrical equipment for internal combustion engines and vehicles; 5825 Polyurethanes; 7492 Taps, cocks, valves etc. for pipes; 7373 Welding, brazing, cutting, soldering machines; 7139 Parts for internal combustion piston engines; 7831 Public service type passenger motor vehicles; 2226 Rape and colza seeds.]


Figure 4. The largest tree in the Products MSF. When passing from classification colors to the
actual product names, we see that they are all strongly related. Interestingly, colza seeds appear in
the lower left corner of the figure.



[Figure 5 image. Legible labels: "1998", "α".]



Figure 5. Plot of the mean correlation (square of the Pearson coefficient, R^2) between the
logarithm of GDP and the fixpoint weights of countries in the biased (Markovian) random
walk method, as a function of the parameters α and β. The contour for the level R^2 = 0.4 is
indicated as a green loop in the orange region.
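
The correlation measure named in the caption of Figure 5 can be sketched as follows for a single set of weights. The GDP values and weights below are hypothetical toy numbers, not data from this paper, and the function name `r_squared` is introduced only for this example; how the mean over several values is taken is not specified here and is left out.

```python
import math


def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient between xs and ys.

    Assumption of this sketch: neither sequence is constant, so both
    variances are non-zero.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


# Hypothetical toy values: GDP per country and the country weights.
gdp = [1.0e9, 1.0e10, 1.0e11, 1.0e12]
weights = [0.1, 0.2, 0.3, 0.4]
score = r_squared([math.log(g) for g in gdp], weights)
```

In this toy case the weights grow linearly with the logarithm of GDP, so the score is close to 1; scanning such a score over a grid of α and β values would give a surface of the kind shown in the figure.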



